Code Visibility and Accessibility

The R code chunks in this report are hidden by default. The Code drop-down at the top of the report reveals all chunks at once for easier reading. Alternatively, individual code sections can be shown by clicking the Code button above each chunk; clicking the button again hides the code.

# Global chunk options
knitr::opts_chunk$set(
  message = FALSE,
  warning = FALSE,
  comment = NA,
  fig.pos = 'H'
)

# Libraries
library(tidyverse)   # loads dplyr, ggplot2, readr, forcats, etc.
library(plotly)
library(ggthemes)
library(knitr)
library(simmer)
library(simmer.plot)

Acknowledgement

Grammarly, an AI-powered writing assistant tool, enhanced the clarity, coherence, and grammatical accuracy of initial notes and the final draft. Its assistance in identifying errors and suggesting style and vocabulary improvements significantly contributed to the quality of this report.

Problem Definition and Project Objectives

ABC Co. is a manufacturing company serving clients in Chile, Morocco, Portugal, and Greece. Its production process consists of three main stages: moulding, inspection, and packaging. Each stage has varying processing times depending on the client’s country, with a degree of unpredictability.

Customer orders arrive regularly but not at fixed intervals, averaging one order every 0.6 minutes. The facility currently has 10 moulding machines, 9 inspection tables, and 9 packaging machines.

This project aims to analyse and simulate the current production process to uncover inefficiencies and explore how changes in resource allocation might improve performance. Specifically, the study will focus on identifying bottlenecks, measuring average waiting times at each stage, and calculating the financial impact of delays—particularly between the moulding and inspection phases. The ultimate goal is to propose data-driven improvements that enhance operational efficiency and reduce costs.

Conceptual Model

Process Overview

ABC Co.’s production system consists of three consecutive stages: moulding, inspection, and packaging. Products are manufactured for clients from Chile, Morocco, Portugal, and Greece. Processing times at each stage vary by country and follow exponential distributions.

# Define production parameters by country
production_parameters <- data.frame(
  Country    = c("Chile", "Morocco", "Portugal", "Greece"),
  Moulding   = c(5.3, 4.4, 4.6, 2.5),
  Inspection = c(3.7, 1.3, 5.2, 3.4),
  Packaging  = c(3.8, 3.9, 3.8, 3.1)
)

# Display the table
kable(production_parameters, caption = "Production Parameters by Country")
Table 1: Production Parameters by Country
Country Moulding Inspection Packaging
Chile 5.3 3.7 3.8
Morocco 4.4 1.3 3.9
Portugal 4.6 5.2 3.8
Greece 2.5 3.4 3.1

Order Arrivals

Customer orders arrive at random intervals, following a normal distribution with a mean inter-arrival time of 0.6 minutes and a standard deviation of 0.1 minutes.
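As a quick illustrative check (not part of the report's simulation pipeline), inter-arrival times drawn from this distribution can be sampled and summarised to confirm the stated parameters:

```r
# Illustrative only: sample inter-arrival times from the stated
# distribution (normal, mean 0.6 min, sd 0.1 min) and summarise
set.seed(2024)
iat <- rnorm(10000, mean = 0.6, sd = 0.1)
round(c(mean = mean(iat), sd = sd(iat)), 2)
```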

Resources and Flow

The facility is equipped with 10 moulding machines, 9 inspection tables, and 9 packaging machines. Products move through each stage in sequence; if a resource is busy, items wait in a queue until one becomes available (Figure 1).

# Include an external image
knitr::include_graphics("Workflow.png")

Figure 1: Workflow Diagram of the Production Process

Scenario Analysis

The study explores the impact of adding six machines distributed across the moulding and packaging stages in different combinations (e.g., 1 for moulding and 5 for packaging, 2 and 4, up to 5 and 1). Each scenario ensures that both stages receive at least one additional machine. A key cost factor examined is the delay between moulding and inspection, with each minute of waiting incurring a degradation cost of £150.
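The five allocation scenarios described above can be enumerated directly. This is an illustrative sketch (the variable names are not from the report's code):

```r
# Six additional machines split between moulding and packaging,
# with each stage receiving at least one machine
scenarios <- data.frame(
  extra_moulding  = 1:5,
  extra_packaging = 5:1
)
scenarios$total_added <- scenarios$extra_moulding + scenarios$extra_packaging
scenarios   # five scenarios, each adding six machines in total
```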

Performance Metrics

The simulation focuses on two key metrics:

  • Average waiting time at each stage, calculated as the difference between the total time spent in the system and the processing time at that stage (waiting time = end time – start time – activity time).

  • Average degradation cost due to delays between moulding and inspection.
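These two metrics can be illustrated with a small worked example (all numbers here are made up for illustration only):

```r
# Waiting time for one order at one stage: waiting = end - start - activity
start_time    <- 2.0    # order arrives at the stage (minutes)
end_time      <- 10.5   # order leaves the stage
activity_time <- 8.0    # actual processing time
waiting_time  <- end_time - start_time - activity_time   # 0.5 minutes

# Degradation cost between moulding and inspection: £150 per minute of waiting
degradation_cost <- 150 * waiting_time                   # £75
```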

Assumptions & Simplifications

To streamline the analysis and deliver insights within the project timeframe, several assumptions and simplifications were made:

  • The system operates continuously in a steady state, where long-term performance remains stable.

  • The system’s performance is evaluated after an initial warm-up period to ensure stable and representative results. Ideally, each simulation run would have its own warm-up period. However, due to time constraints, a single warm-up period—estimated from a one-replication run—was applied to all replications. This approach is supported by Law (2015, pp. 511–523), who notes that when the system’s stochastic behaviour is consistent across replications, a common warm-up period can be used effectively.

  • Historical order data was used to reflect country-based demand proportions, which are assumed to remain constant over time. Additionally, all orders are assumed to be received and fulfilled on the same day, regardless of the actual timeline in the historical dataset.

  • The waiting time between moulding and inspection is measured as the waiting time at the inspection stage.

  • Machines are assumed to be always available, with no breakdowns, set-up times, or maintenance interruptions.

  • Raw materials and staffing are considered unlimited, removing external supply and labour constraints.

  • All queues follow a First-Come-First-Served policy, and products are processed individually—no batching is involved.

  • The cost of acquiring and installing the six additional machines used in scenario analysis is not considered as part of the cost reduction strategy. This financial aspect is beyond the scope of the current study and is therefore excluded from the simulation and conclusions.

Data Collection and Preparation

The accuracy and reliability of any discrete-event simulation (DES) depend largely on the quality and relevance of the input data. This study used a sample of ABC Co.’s historical order records to analyse ordering patterns across its four customer regions: Chile, Morocco, Portugal, and Greece.

# Load data
data <- read_csv("Data.csv")

# Preview the first few rows
knitr::kable(
  head(data),
  caption = "Preview of the sample of ABC Co.’s historical order records"
)
Table 2: Preview of the sample of ABC Co.’s historical order records
Date From
Sunday, 4 February 2018 Portugal
Friday, 6 December 2019 Portugal
Wednesday, 21 July 2021 Greece
Tuesday, 18 September 2018 Chile
Wednesday, 28 October 2020 Portugal
Wednesday, 12 January 2022 Morocco
# Show summary statistics
summary(data)
     Date               From          
 Length:1000        Length:1000       
 Class :character   Class :character  
 Mode  :character   Mode  :character  

To prepare the data for analysis, the order Date field—originally stored as a text string—was converted to a standard date format to enable time-based insights.

# Load date-handling library
library(lubridate)

# Clean and format the Date column, then sort the data
data <- data |>
  mutate(Date = dmy(gsub("^[A-Za-z]+, ", "", Date))) |>
  arrange(Date)

# Generate a detailed data description
Hmisc::describe(data)
data 

 2  Variables      1000  Observations
--------------------------------------------------------------------------------
Date 
         n    missing   distinct       Info       Mean    pMedian        Gmd 
      1000          0        770          1 2020-06-11      18424      599.8 
       .05        .10        .25        .50        .75        .90        .95 
2018-03-27 2018-06-20 2019-03-19 2020-06-10 2021-08-21 2022-06-12 2022-10-01 

lowest : 2018-01-03 2018-01-04 2018-01-08 2018-01-09 2018-01-14
highest: 2022-12-22 2022-12-23 2022-12-25 2022-12-26 2022-12-28
--------------------------------------------------------------------------------
From 
       n  missing distinct 
    1000        0        4 
                                              
Value         Chile   Greece  Morocco Portugal
Frequency       132      305      236      327
Proportion    0.132    0.305    0.236    0.327
--------------------------------------------------------------------------------

An initial exploratory analysis (Figure 2) revealed that Portugal and Greece together account for over 60% of total orders, while Chile represents just 13%.

ggplotly(
  data |>
    count(From) |>
    mutate(prop = n / sum(n)) |>
    mutate(From = fct_reorder(From, prop, .desc = TRUE)) |>
    ggplot(aes(x = From, y = prop, group = 1)) +
    geom_bar(stat = "identity", fill = "steelblue") +
    labs(
      title = "Proportion of Orders per Country",
      x = "",
      y = ""
    ) +
    theme_tufte(base_family = "Helvetica") +
    theme(
      plot.title = element_text(hjust = 0.5),
      legend.position = "top",
      axis.text.x = element_text(angle = 45, hjust = 1, size = 9)
    ) +
    scale_y_continuous(labels = scales::percent) +
    scale_fill_brewer(palette = "Blues", direction = -1) +
    scale_color_manual(values = c("darkblue", "blue", "royalblue"))
) |>
  layout(
    annotations = list(
      text = "Hover over the plot for details",
      x = 0.5,
      y = 1.05,
      xref = "paper",
      yref = "paper",
      showarrow = FALSE,
      font = list(size = 12, family = "Helvetica")
    )
  )

Figure 2: Order proportions per country.

Monthly trends (Figure 3) showed consistent order activity across time, with only occasional gaps for specific countries, confirming the dataset’s reliability for modelling purposes.

ggplotly(
  data |>
    mutate(year_month = format(Date, "%Y-%m")) |>  # Extract Year-Month
    ggplot(aes(x = year_month)) +
      geom_bar(fill = "steelblue") +
      facet_wrap(~ From) +
      labs(
        title = "Orders per Country",
        x = "Year-Month",
        y = "Count of Orders"
      ) +
      theme_tufte(base_family = "Helvetica") +
      theme(
        plot.title = element_text(hjust = 0.5),
        legend.position = "top",
        axis.text.x = element_text(angle = 90, hjust = 1, size = 5)
      ) +
      scale_fill_brewer(palette = "Blues", direction = -1) +
      scale_color_manual(values = c("darkblue", "blue", "royalblue"))
) |>
  layout(
    annotations = list(
      text = "Hover over the plot for details",
      x = 0.5,
      y = 1.075,
      xref = "paper",
      yref = "paper",
      showarrow = FALSE,
      font = list(size = 12, family = "Helvetica")
    )
  )

Figure 3: Distribution of orders per date and country.

Due to the limited scope of the historical dataset—primarily order dates and countries of origin—synthetic data was generated to support a robust simulation. This included inter-arrival times (normal, mean 0.6 minutes, SD 0.1 minutes) and stage processing times (exponential, with country-specific means):

# Function to generate simulated orders with processing times
generate_orders <- function(data, production_parameters) {
  
  # Generate inter-arrival times (normal: mean 0.6 min, sd 0.1 min)
  orders <- data |>
    mutate(ArrivalTime = rnorm(n(), mean = 0.6, sd = 0.1))
  
  # Merge with production parameters
  orders <- orders |>
    left_join(production_parameters, by = c("From" = "Country"))
  
  # Simulate processing times for each stage
  orders <- orders |>
    mutate(
      MouldingTime   = rexp(n(), rate = 1 / Moulding),
      InspectionTime = rexp(n(), rate = 1 / Inspection),
      PackagingTime  = rexp(n(), rate = 1 / Packaging)
    ) |>
    dplyr::select(-Moulding, -Inspection, -Packaging) |>
    arrange(Date)
  
  return(orders)
}

# Generate a sample of orders
set.seed(123)
orders_generated_sample <- generate_orders(data, production_parameters)

# Display and summarise the sample
knitr::kable(head(orders_generated_sample), caption = "Preview of the Generated Orders")
Table 3: Preview of the Generated Orders
Date From ArrivalTime MouldingTime InspectionTime PackagingTime
2018-01-03 Portugal 0.5439524 7.6529565 5.154851 0.0830213
2018-01-04 Greece 0.5769823 3.8560529 3.418660 6.7997522
2018-01-04 Portugal 0.7558708 7.2667925 2.877702 3.0807833
2018-01-08 Greece 0.6070508 0.0721713 2.984466 4.9803484
2018-01-09 Greece 0.6129288 1.8293483 1.429767 2.3163499
2018-01-14 Portugal 0.7715065 24.0740333 4.265682 0.5342656
summary(orders_generated_sample)
      Date                From            ArrivalTime      MouldingTime      
 Min.   :2018-01-03   Length:1000        Min.   :0.3190   Min.   : 0.001524  
 1st Qu.:2019-03-19   Class :character   1st Qu.:0.5372   1st Qu.: 1.044152  
 Median :2020-06-10   Mode  :character   Median :0.6009   Median : 2.482796  
 Mean   :2020-06-11                      Mean   :0.6016   Mean   : 3.987952  
 3rd Qu.:2021-08-21                      3rd Qu.:0.6665   3rd Qu.: 5.441266  
 Max.   :2022-12-28                      Max.   :0.9241   Max.   :25.734796  
 InspectionTime      PackagingTime      
 Min.   : 0.003819   Min.   : 0.003416  
 1st Qu.: 0.832982   1st Qu.: 1.081277  
 Median : 2.128350   Median : 2.554547  
 Mean   : 3.606196   Mean   : 3.624447  
 3rd Qu.: 4.608425   3rd Qu.: 4.845417  
 Max.   :33.679915   Max.   :27.497896  

To validate the generated data, visual checks (Figure 4) confirmed that the inter-arrival times follow the intended normal distribution.

# Check if arrival times follow a normal distribution
ggplot(orders_generated_sample, aes(x = ArrivalTime)) +
  geom_histogram(aes(y = after_stat(density)), bins = 30, fill = "steelblue", alpha = 0.5) +
  stat_function(
    fun = dnorm,
    args = list(mean = 0.6, sd = 0.1),
    color = "red"
  ) +
  labs(
    title = "Generated Arrival Times vs Normal Distribution Fit",
    x = "Minutes",
    y = "Density"
  ) +
  theme_tufte(base_family = "Helvetica") +
  theme(
    plot.title = element_text(hjust = 0.5),
    legend.position = "right",
    axis.text.x = element_text(angle = 45, hjust = 1)
  )

Figure 4: Normal distribution fit of generated arrival times.

Furthermore, additional comparisons (Figure 5) verified generated processing times against known parameters.

# Check if moulding times follow an exponential distribution
ggplot(orders_generated_sample, aes(x = MouldingTime)) +
  geom_histogram(aes(y = after_stat(density)), bins = 30, fill = "steelblue", alpha = 0.5) +
  stat_function(
    fun = dexp,
    args = list(rate = 1 / mean(orders_generated_sample$MouldingTime)),
    color = "red"
  ) +
  labs(
    title = "Generated Moulding Times vs Exponential Fit",
    x = "Minutes",
    y = "Density"
  ) +
  theme_tufte(base_family = "Helvetica") +
  theme(
    plot.title = element_text(hjust = 0.5),
    legend.position = "right",
    axis.text.x = element_text(angle = 45, hjust = 1)
  )

Figure 5: Exponential distribution fit of generated moulding times.

This modelling approach accurately reflects ABC Co.’s operational dynamics while enabling a flexible simulation environment. It supports confident scenario testing and provides a solid foundation for making informed, data-driven operational decisions.

Simulation Model

The simulation model replicates ABC Co.’s production process using a queueing-based approach for each stage: moulding, inspection, and packaging. Each stage is modelled as a multi-server queue with c servers (i.e., machines); for theoretical benchmarking it is approximated as an M/M/c queue, even though arrivals follow a normal rather than an exponential distribution. The system includes 10 moulding machines, 9 inspection tables, and 9 packaging machines in the current configuration.

Processing times—already generated using exponential distributions with country-specific parameters—are integrated into the simulation logic to reflect real-time variability in production times.

The primary goal of the simulation is to estimate average waiting times at each stage, factoring in machine availability and queuing delays.

library(skimr)

moulding_machine = 10 
inspection_table = 9 
packaging_machine = 9

# Simulation function
simulate_plant <- function(orders,
                           n_moulding   = moulding_machine,
                           n_inspection = inspection_table,
                           n_packaging  = packaging_machine) {
  
  # Create the simulation environment
  env <- simmer("ABC Co. Plant")
  
  # Add resources
  env |>
    add_resource("Moulding", capacity = n_moulding) |>
    add_resource("Inspection", capacity = n_inspection) |>
    add_resource("Packaging", capacity = n_packaging)
  
  # Define the trajectory (process flow)
  traj <- trajectory("Order Process") |>
    seize("Moulding", 1) |>
    timeout_from_attribute("moulding_time") |>
    release("Moulding", 1) |>
    
    seize("Inspection", 1) |>
    timeout_from_attribute("inspection_time") |>
    release("Inspection", 1) |>
    
    seize("Packaging", 1) |>
    timeout_from_attribute("packaging_time") |>
    release("Packaging", 1)

  # Prepare arrival data with processing time attributes
  arrival_data <- orders |>
    transmute(
      time = ArrivalTime,
      moulding_time = MouldingTime,
      inspection_time = InspectionTime,
      packaging_time = PackagingTime
    )

  # Add a single generator using add_dataframe (efficient)
  env |>
    add_dataframe("order", traj, arrival_data, mon = 2) |>
    run()
  
  return(env)
}

# Run the simulation with 1 replication
set.seed(123)
orders <- generate_orders(data, production_parameters)
env <- simulate_plant(orders)

# Compute waiting times (waiting = end - activity - start)
initial_simulation_results <- get_mon_arrivals(env) |> 
  arrange(start_time) |>  # Order by start time
  mutate(waiting_time = end_time - activity_time - start_time) 

# Display preview
knitr::kable(
  head(initial_simulation_results),
  caption = "Preview of the Initial Simulation (1 replication)"
)
Table 4: Preview of the Initial Simulation (1 replication)
name start_time end_time activity_time finished replication waiting_time
order0 0.5439524 13.43478 12.890828 TRUE 1 0
order1 1.1209347 15.19540 14.074465 TRUE 1 0
order2 1.8768055 15.10208 13.225277 TRUE 1 0
order3 2.4838564 10.52084 8.036986 TRUE 1 0
order4 3.0967851 8.67225 5.575465 TRUE 1 0
order5 3.8682916 32.74227 28.873981 TRUE 1 0

To ensure meaningful performance insights, it is crucial to exclude the initial “start-up” phase of the simulation, where results may not yet be stable. This period, known as the warm-up period, allows the system to reach a steady state before performance metrics are recorded.

While the ideal method involves calculating a unique warm-up period for each simulation run, a single warm-up value was used for all runs to streamline analysis. This value was determined using the marginal standard error rule (MSER), applied to a baseline simulation with one replication. This approach is accepted by Law (2015, pp. 511–523) when the system’s stochastic behaviour is consistent across runs, offering a practical balance between accuracy and efficiency.

As shown in Figure 6, the system stabilises after the first 280 orders. Based on the MSER heuristic, these initial orders are excluded from the performance analysis to avoid skewed results.

# Compute MSER for each truncation point
compute_mser <- function(data) {
  n <- length(data)
  mser_values <- numeric(n / 2)
  
  for (i in 1:(n / 2)) {
    truncated <- data[i:n]
    mser_values[i] <- sd(truncated) / mean(truncated)
  }
  
  return(mser_values)
}

# Apply MSER heuristic to waiting times
mser_values <- compute_mser(initial_simulation_results$waiting_time)
warmup_mser <- which.min(mser_values)

# Prepare data for plotting
mser_df <- data.frame(
  Order = 1:length(mser_values),
  MSER  = mser_values
)

# Plot MSER values with warm-up indicator
ggplotly(
  ggplot(mser_df, aes(x = Order, y = MSER)) +
    geom_line(colour = "steelblue", alpha = 0.5) +
    geom_vline(xintercept = warmup_mser, colour = "steelblue", linetype = "dashed") +
    annotate(
      "text",
      x = warmup_mser,
      y = max(mser_values),
      label = paste("Warm-up at Order", warmup_mser),
      colour = "steelblue",
      hjust = -5.5
    ) +
    labs(
      title = "MSER Heuristic for Warm-up Period",
      x = "Order",
      y = "MSER Value"
    ) +
    theme_tufte(base_family = "Helvetica") +
    theme(
      plot.title = element_text(hjust = 0.5),
      legend.position = "top",
      axis.text.x = element_text(angle = 45, hjust = 1)
    ) 
)|>
  layout(
    annotations = list(
      text = "Hover over the plot for details",
      x = 0.5,
      y = 1.05,
      xref = "paper",
      yref = "paper",
      showarrow = FALSE,
      font = list(size = 12, family = "Helvetica")
    )
  )

Figure 6: Identifying warm-up orders using MSER heuristic.

Due to the randomness in both order arrivals and processing times, multiple independent replications are required to obtain statistically reliable results. To determine an appropriate number of replications, cumulative mean waiting times were analysed to assess when performance metrics stabilised.

# ---- Parameters ----
set.seed(42)                # For reproducibility
z <- 1.96                   # 95% confidence
w <- 0.2                    # Desired half-width of CI
min_reps <- 100              # Minimum replications before estimating
results <- c()              # To store results
max_reps <- 500             # Safety stop

# ---- Loop to determine required replications ----
repeat {
  orders <- generate_orders(data, production_parameters)
  
  # Simulate environment
  env <- simulate_plant(orders)

  # Get arrivals and calculate waiting time per order (post-warmup)
  arrivals <- get_mon_arrivals(env, per_resource = TRUE) |>
    mutate(
      waiting_time = round(end_time - activity_time - start_time, 2),
      order_id     = parse_number(name)
    ) |>
    group_by(replication) |>
    arrange(start_time, .by_group = TRUE) |>
    filter(order_id > warmup_mser) |>
    ungroup()

  # Calculate mean waiting time (only if arrivals exist)
  if (nrow(arrivals) == 0) {
    result <- NA
  } else {
    result <- mean(arrivals$waiting_time, na.rm = TRUE)
  }

  # Store result
  results <- c(results, result)

  # Check if enough clean results to estimate required replications
  if (length(results) >= min_reps) {
    s <- sd(results)
    mean_val <- mean(results)
    n_needed <- ceiling((z * s / w)^2)

    if (length(results) >= n_needed || length(results) >= max_reps) {
      break
    }
  }
}

# ---- Final Output ----
cat(paste(
  " Final replications used:", length(results), "\n",
  "Mean waiting time:", round(mean(results), 3), "\n",
  "95% CI Half-width:", round((z * sd(results)) / sqrt(length(results)), 3), "\n"
))
 Final replications used: 100 
 Mean waiting time: 0.096 
 95% CI Half-width: 0.007 

Figure 7 shows that cumulative mean waiting times stabilised after approximately 50 replications, indicating that further replications would have minimal impact on result accuracy.

# Build cumulative statistics over replications
ci_data <- data.frame(
  Replication = seq_along(results),
  MeanWaitingTime = results
) |>
  mutate(
    CumulativeMean = cummean(MeanWaitingTime),
    SD = sapply(1:n(), function(i) if (i > 1) sd(MeanWaitingTime[1:i]) else 0),
    CI_HalfWidth = 1.96 * SD / sqrt(Replication),
    LowerCI = CumulativeMean - CI_HalfWidth,
    UpperCI = CumulativeMean + CI_HalfWidth
  )

# Plot convergence of average waiting time
ggplotly(
  ggplot(ci_data, aes(x = Replication)) +
    geom_line(aes(y = CumulativeMean), color = "steelblue", linewidth = 0.7) +
    geom_ribbon(aes(ymin = LowerCI, ymax = UpperCI), alpha = 0.2, fill = "steelblue") +
    geom_vline(xintercept = 50, colour = "steelblue", linetype = "dashed") +
    labs(
      title = "Convergence of Average Waiting Time Across Replications",
      x = "Number of Replications",
      y = "Cumulative Average Waiting Time",
      caption = "Shaded area = 95% Confidence Interval"
    ) +
    theme_tufte(base_family = "Helvetica") +
    theme(
      plot.title = element_text(hjust = 0.5),
      legend.position = "right",
      axis.text.x = element_text(angle = 45, hjust = 1)
    )
) |>
  layout(
    annotations = list(
      text = "Hover over the plot for details",
      x = 0.5,
      y = 1.05,
      xref = "paper",
      yref = "paper",
      showarrow = FALSE,
      font = list(size = 12, family = "Helvetica")
    )
  )

Figure 7: Cumulative average waiting time across replications.

To ensure robustness, the simulation performed 50 independent replications, providing sufficient statistical confidence in the reported performance measures.

# Run multiple replications and track arrivals
set.seed(42)
n_reps <- 50

rep_arrivals <- lapply(1:n_reps, function(i) {
  orders <- generate_orders(data, production_parameters)
  env <- simulate_plant(orders)

  get_mon_arrivals(env, per_resource = TRUE) |>
    mutate(replication = i) |>
    mutate(
      waiting_time = round(end_time - activity_time - start_time, 2),
      order_id = parse_number(name)
    ) |>
    group_by(replication) |>
    arrange(start_time, .by_group = TRUE) |>
    filter(order_id > warmup_mser) |>
    ungroup()
})

# Combine results from all replications
all_arrivals <- bind_rows(rep_arrivals)

# Summary output
cat("Replications tracked:", length(unique(all_arrivals$replication)), "\n")
Replications tracked: 50 
# Preview of simulation results
knitr::kable(
  head(all_arrivals),
  caption = "Preview of the Simulation Results (50 replications, excluding warm-up orders)"
)
Table 5: Preview of the Simulation Results (50 replications, excluding warm-up orders)
name start_time end_time activity_time resource replication waiting_time order_id
order281 168.5482 169.0158 0.4676739 Moulding 1 0.00 281
order281 169.0158 169.8127 0.7968853 Inspection 1 0.00 281
order282 169.0912 169.7658 0.6746243 Moulding 1 0.00 282
order283 169.7548 178.5807 8.8259342 Moulding 1 0.00 283
order282 169.7658 172.0101 2.2442206 Inspection 1 0.00 282
order281 169.8127 173.6532 2.0248702 Packaging 1 1.82 281

Verification & Validation

Verification

Verification ensures the simulation model was developed and implemented correctly, according to the intended design and logic. The model was built using the simmer package in R and carefully reviewed to confirm that it accurately reflects ABC Co.’s real-world production process—moving through the three stages of moulding, inspection, and packaging.

Each stage was programmed as an M/M/c queue, ensuring that:

  • Resources are correctly seized before processing begins,

  • Processing times follow the specified country-based exponential distributions,

  • Resources are released in the proper sequence after completion.

A sample of simulated orders was manually traced through the system to verify the correct model behaviour. This confirmed that:

  • Order arrivals matched the intended normal distribution,

  • Stage durations reflected the country-specific exponential patterns,

  • The flow of products between stages was uninterrupted and logically consistent.

In addition, key performance indicators—such as waiting times, queue lengths, and machine utilisation—were tested under both normal and extreme load conditions to ensure the system responded in a stable and predictable manner.
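One simple stability check of this kind can be sketched from queueing theory (this is an illustrative sketch, not the report's code): each stage's utilisation ρ = λ·E[S]/c must stay below 1, and removing servers quickly pushes every stage past saturation. The means below are the unweighted averages of the Table 1 processing times.

```r
# Utilisation per stage: rho = lambda * E[S] / c  (must be < 1 for stability)
lambda       <- 1 / 0.6                   # arrival rate (orders per minute)
mean_service <- c(Moulding = 4.20, Inspection = 3.40, Packaging = 3.65)
servers      <- c(Moulding = 10,   Inspection = 9,    Packaging = 9)

rho_current <- lambda * mean_service / servers
rho_halved  <- lambda * mean_service / floor(servers / 2)

round(rho_current, 2)   # all below 1: the current configuration is stable
round(rho_halved, 2)    # all above 1: halving servers would saturate every stage
```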

Validation

Validation evaluates whether the model accurately represents the real-world system or provides a sound approximation based on theoretical expectations. In the absence of detailed operational data for ABC Co., validation was carried out using a combination of statistical checks and theoretical comparisons.

First, the warm-up period was validated by confirming that the first 280 orders were excluded from each simulation run (Figure 8), ensuring performance metrics were only calculated once the system had reached a stable state.

# Plot start time by order ID with warm-up cutoff indicator
ggplot(all_arrivals, aes(x = order_id, y = start_time)) +
  geom_point(color = "lightblue", size = 0.2) +
  geom_vline(xintercept = warmup_mser, color = "steelblue", linetype = "dashed") +
  annotate(
    "text",
    x = warmup_mser,
    y = 600,
    label = paste("Order", warmup_mser),
    colour = "steelblue",
    hjust = 0
  ) +
  labs(
    title = "Orders From the Simulation Output",
    x = "Order ID",
    y = "Start Time"
  ) +
  theme_tufte(base_family = "Helvetica") +
  theme(
    plot.title = element_text(hjust = 0.5),
    legend.position = "right",
    axis.text.x = element_text(angle = 45, hjust = 1)
  )

Figure 8: Orders considered in the simulation output.

Having confirmed the correct implementation of the input parameters, we then verified that the simulated activity times closely followed their intended exponential distributions (Figure 9).

# Check if activity times follow an exponential distribution
ggplot(all_arrivals, aes(x = activity_time)) +
  geom_histogram(aes(y = after_stat(density)), bins = 30, fill = "steelblue", alpha = 0.5) +
  stat_function(
    fun = dexp,
    args = list(rate = 1 / mean(all_arrivals$activity_time)),
    color = "red"
  ) +
  labs(
    title = "Simulated Activity Times vs Exponential Fit",
    x = "Minutes",
    y = "Density"
  ) +
  theme_tufte(base_family = "Helvetica") +
  theme(
    plot.title = element_text(hjust = 0.5),
    legend.position = "right",
    axis.text.x = element_text(angle = 45, hjust = 1)
  )

Figure 9: Exponential fit of simulated activity times.

To validate the model’s output, simulated performance metrics—such as average waiting times at the inspection stage—were compared with theoretical values derived from standard M/M/c queuing models.

# Load queueing analysis library
library(queueing)

# Summarise simulation results by stage
summary_table <- all_arrivals |>
  group_by(resource) |>
  summarize(
    "Avg Waiting"  = round(mean(waiting_time), 2),
    "Max Waiting"  = round(max(waiting_time), 2),
    "Total Orders" = n(),
    .groups = "drop"
  )

# Define parameters
Mean_IAT        <- 0.6  # Mean inter-arrival time
Mean_Inspection <- mean(production_parameters$Inspection)  # Mean service time for Inspection

# Theoretical queueing model for the Inspection stage (M/M/c, multi-server)
i_mmc <- NewInput.MMC(lambda = 1 / Mean_IAT, mu = 1 / Mean_Inspection, c = inspection_table)
model <- QueueingModel(i_mmc)

# Compare simulated and theoretical waiting times
cat(paste(
  " Simulated Mean Waiting Time in Inspection:", summary_table$`Avg Waiting`[1], "minutes\n",
  "Theoretical Mean Waiting Time in Inspection Queue (Wq):", round(model$Wq, 2), "minutes\n"
))
 Simulated Mean Waiting Time in Inspection: 0.1 minutes
 Theoretical Mean Waiting Time in Inspection Queue (Wq): 0.15 minutes

The simulated average waiting time (0.1 minutes) was close to the theoretical expectation (0.15 minutes), with the minor difference attributable to the mix of country-specific parameters and to the near-deterministic arrival process, which produces less queueing than the Poisson arrivals assumed by the M/M/c model. This confirms that the model realistically captures the system’s queueing behaviour.
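As a cross-check on the `queueing` package output, the theoretical Wq can also be computed from first principles using the Erlang C formula. This sketch uses the same parameters as the comparison above (λ = 1/0.6, mean inspection time 3.4 minutes, c = 9 servers):

```r
# Erlang C mean waiting time in queue (Wq) for an M/M/c queue
erlang_c_wq <- function(lambda, mu, c) {
  a   <- lambda / mu                      # offered load
  rho <- a / c                            # utilisation (must be < 1)
  # P(wait) = (a^c/c!) / ((1 - rho) * sum_{k<c} a^k/k! + a^c/c!)
  terms  <- a^(0:(c - 1)) / factorial(0:(c - 1))
  top    <- a^c / factorial(c)
  p_wait <- top / ((1 - rho) * sum(terms) + top)
  p_wait / (c * mu - lambda)              # Wq = P(wait) / (c*mu - lambda)
}

round(erlang_c_wq(lambda = 1 / 0.6, mu = 1 / 3.4, c = 9), 2)   # 0.15 minutes
```

This reproduces the 0.15-minute theoretical value reported above.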

Finally, Figure 10 shows that the distribution of simulated processing times closely matches the expected exponential shape, with a slightly heavier tail. This variation is anticipated due to the combination of different exponential distributions across countries, and it reflects the complexity of real-world operations.

# Simulated activity times from the simulation
simulated_activity_times <- all_arrivals$activity_time

# Theoretical exponential quantiles
theoretical_quantiles <- qexp(
  ppoints(length(simulated_activity_times)),
  rate = 1 / mean(simulated_activity_times)
)

# Q-Q plot to assess exponential fit
qqplot(
  theoretical_quantiles,
  sort(simulated_activity_times),
  main = "Q-Q Plot: Simulated Activity Times vs Exponential Distribution",
  xlab = "Theoretical Exponential Quantiles",
  ylab = "Simulated Activity Times",
  col = "steelblue"
)
abline(0, 1, col = "red", lwd = 1)  # Reference line
Q-Q plot comparing simulated activity times to a theoretical exponential distribution.

Figure 10: Q-Q plot comparing simulated activity times to a theoretical exponential distribution.

Together, these verification and validation steps confirm that the simulation model is both technically sound and a reliable tool for analysing ABC Co.’s production system under different resource configurations.

Key Results

The simulation results offer valuable insights into ABC Co.’s current production system performance.

As shown in Figure 11, the system reaches a steady state relatively quickly, with stable resource usage after the warm-up period. The system successfully processed 1,000 orders in just over 600 minutes, confirming a strong throughput rate under the current configuration. Server usage averaged around 6.5 of 10 machines in moulding (roughly 65% utilisation), 5.5 of 9 in inspection (about 61%), and 6 of 9 in packaging (about 67%), suggesting that the current resource allocation effectively maintains smooth operations.

# Track resource usage across multiple replications
set.seed(42)

rep_resources <- lapply(1:n_reps, function(i) {
  orders <- generate_orders(data, production_parameters)
  env <- simulate_plant(orders)

  get_mon_resources(env) |>
    mutate(replication = i)
})

# Combine results from all replications
all_resources <- bind_rows(rep_resources)

# Plot resource usage
plot(all_resources)

Figure 11: Resource usage during the initial simulation (1 replication, including warm-up period).

Waiting times across stages remained relatively low (Figure 12). Moulding had the shortest average wait (0.04 minutes), followed by inspection (0.10 minutes), while packaging had the highest (0.15 minutes).

# Interactive bar chart of average waiting times per stage
ggplotly(
  summary_table |>
    dplyr::select(resource, `Avg Waiting`) |>
    mutate(resource = factor(resource, levels = c("Moulding", "Inspection", "Packaging"))) |>
    ggplot(aes(x = resource, y = `Avg Waiting`)) +
      geom_col(fill = "steelblue") +
      labs(
        title = "Average Waiting Time per Stage",
        x = "",
        y = "Minutes"
      ) +
      theme_tufte(base_family = "Helvetica") +
      theme(
        plot.title = element_text(hjust = 0.5),
        legend.position = "none",
        axis.text.x = element_text(angle = 45, hjust = 1, size = 9)
      )
) |>
  layout(
    annotations = list(
      text = "Hover over the plot for details",
      x = 0.5,
      y = 1.05,
      xref = "paper",
      yref = "paper",
      showarrow = FALSE,
      font = list(size = 12, family = "Helvetica")
    )
  )

Figure 12: Average waiting time per stage (50 replications, excluding warm-up orders).

While most orders were processed with minimal delay, occasional spikes in waiting times were observed, especially in inspection and packaging (Figure 13), where delays occasionally exceeded 5 minutes. Among the three stages, packaging showed the highest tendency for queuing, indicating it could become a bottleneck under increased demand.

Although inspection generally performs well, it remains critical due to its direct link to degradation costs when delays occur after moulding. Therefore, ongoing monitoring of this stage is essential from an operational and financial perspective.

# Interactive histogram of waiting times by stage
ggplotly(
  all_arrivals |>
    mutate(resource = factor(resource, levels = c("Moulding", "Inspection", "Packaging"))) |>
    ggplot(aes(x = waiting_time)) +
      geom_histogram(binwidth = 1, fill = "steelblue", color = "white") +
      facet_wrap(~resource, scales = "free_y") +
      labs(
        title = "Distribution of Waiting Times per Stage",
        x = "Minutes",
        y = "Frequency"
      ) +
      theme_tufte(base_family = "Helvetica") +
      theme(
        plot.title = element_text(hjust = 0.5),
        legend.position = "right",
        axis.text.x = element_text(angle = 45, hjust = 1)
      )
) 

Figure 13: Distribution of waiting times per stage (50 replications, excluding warm-up orders).

The system effectively meets current demand levels with balanced resource usage and manageable wait times. However, packaging may require reinforcement in future high-demand scenarios, and inspection should remain a focus area due to its cost sensitivity. The simulation model has proven to be a reliable tool for evaluating system performance and will serve as a solid foundation for future scenario analysis and operational planning.
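The replication structure also supports formal uncertainty estimates around these averages. The sketch below illustrates a 95% confidence interval for the mean waiting time; the per-replication data are synthetic stand-ins (exponential waits with a 0.1-minute mean mirroring the inspection figure), not the report's monitored arrivals.

```r
# Sketch: 95% t-interval for mean waiting time across 50 replications,
# using synthetic exponential waits (mean 0.1 min) as stand-in data.
set.seed(42)
rep_means <- replicate(50, mean(rexp(200, rate = 1 / 0.1)))  # per-replication means

ci <- t.test(rep_means, conf.level = 0.95)$conf.int
round(as.numeric(ci), 4)
```

Reporting an interval rather than a point estimate makes clear how much of the stage-to-stage difference in waiting times is attributable to sampling noise across replications.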

Scenario Analysis

A scenario analysis was conducted to explore how strategic investment in additional equipment could improve operational efficiency and reduce costs.

The study examined the addition of six new machines to ABC Co.’s production system, with the aim of optimising their distribution between the moulding and packaging stages. Each scenario allocated at least one machine to each stage.

The five configurations tested, written as (machines added to moulding, machines added to packaging), were: (1, 5), (2, 4), (3, 3), (4, 2), and (5, 1).

The primary objective of this analysis was to minimise degradation costs, which occur when products are delayed between the moulding and inspection stages. Based on the simulation model, these delays are captured by the waiting time at the inspection stage. Every minute of delay translates into a degradation cost of £150 per product.

degradation_cost_per_minute <- 150

degradation_data <- all_arrivals |>
  filter(resource=="Inspection" & waiting_time > 0) |>
  dplyr::select(order_id, replication, resource, waiting_time) |>
  mutate(degradation_cost = waiting_time * degradation_cost_per_minute)

cat(paste("Average degradation cost with the default set-up of 10 moulding and 9 packaging machines: £", round(mean(degradation_data$degradation_cost),2)))
Average degradation cost with the default set-up of 10 moulding and 9 packaging machines: £ 125.77

Each configuration was tested using 50 independent simulation runs, with the first 280 orders excluded from analysis in each run to ensure the system had reached a steady state. For each scenario, the average degradation cost was calculated by multiplying waiting time at inspection by the £150/minute penalty.
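The warm-up cut-off itself (the warmup_mser threshold used in the code below) can be derived with the MSER heuristic: discard the initial-transient prefix that minimises the standard error of the mean of the remaining observations. The following is a sketch of a plain, unbatched variant of MSER on synthetic data with a deliberate start-up transient, not the report's actual series.

```r
# Sketch of the MSER warm-up heuristic on synthetic data: the first 100
# observations carry a start-up transient, the rest are in steady state.
mser_truncation <- function(x, max_frac = 0.5) {
  n <- length(x)
  d_max <- floor(n * max_frac)
  se <- sapply(0:d_max, function(d) {
    tail_x <- x[(d + 1):n]
    sd(tail_x) / sqrt(length(tail_x))   # SE of the mean after discarding d points
  })
  which.min(se) - 1                     # number of observations to discard
}

set.seed(42)
y <- c(seq(10, 2, length.out = 100), rep(2, 900)) + rnorm(1000, sd = 0.5)
mser_truncation(y)
```

On this synthetic series the heuristic discards a prefix close to the true 100-observation transient, which is the same logic behind excluding the first 280 orders in each run above.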

start_time <- Sys.time()

# Set seed for reproducibility
set.seed(42)

# Function to simulate experiments across different machine configurations
simulate_experiments <- function(orders, machine_combinations, n_replications = n_reps) {
  results <- list()
  
  for (combo in machine_combinations) {
    replication_costs <- numeric(n_replications)
    
    for (rep in 1:n_replications) {
      # Create simulation environment with adjusted resource capacities
      env <- simmer("SolCo Plant") |>
        add_resource("Moulding", capacity = moulding_machine + combo[1]) |>
        add_resource("Inspection", capacity = inspection_table) |>
        add_resource("Packaging", capacity = packaging_machine + combo[2])
      
      # Define process trajectory
      traj <- trajectory("Order Process") |>
        seize("Moulding", 1) |>
        timeout_from_attribute("moulding_time") |>
        release("Moulding", 1) |>
        seize("Inspection", 1) |>
        timeout_from_attribute("inspection_time") |>
        release("Inspection", 1) |>
        seize("Packaging", 1) |>
        timeout_from_attribute("packaging_time") |>
        release("Packaging", 1)
      
      # Generate order data
      orders <- generate_orders(data, production_parameters)
      arrival_data <- orders |>
        transmute(
          time = ArrivalTime,
          moulding_time = MouldingTime,
          inspection_time = InspectionTime,
          packaging_time = PackagingTime
        )
      
      # Run simulation
      env |>
        add_dataframe("order", traj, arrival_data, mon = 2) |>
        run()
      
      # Extract arrivals and compute cost
      arrivals <- get_mon_arrivals(env, per_resource = TRUE) |>
        mutate(
          waiting_time = round(end_time - activity_time - start_time, 2),
          order_id = parse_number(name)
        ) |>
        arrange(start_time) |>
        filter(order_id > warmup_mser) |>
        filter(resource == "Inspection" & waiting_time > 0)
      
      replication_costs[rep] <- mean(arrivals$waiting_time, na.rm = TRUE) * degradation_cost_per_minute
    }
    
    # Store average cost for each configuration
    results[[paste(combo[1], combo[2], sep = ",")]] <- tibble(
      moulding_machines   = moulding_machine + combo[1],
      packaging_machines  = packaging_machine + combo[2],
      cost                = mean(replication_costs, na.rm = TRUE)
    )
  }
  
  return(bind_rows(results))
}

# Define machine configurations to test
machine_combinations <- list(
  c(1, 5), c(2, 4), c(3, 3), c(4, 2), c(5, 1)
)

# Run experiments
experiment_results <- simulate_experiments(orders, machine_combinations, n_replications = n_reps)

end_time <- Sys.time()
execution_time <- round(difftime(end_time, start_time, units = "secs"), 2)

cat(paste("Simulation completed in", execution_time, "seconds."))
Simulation completed in 4.57 seconds.

Figure 14 shows that among the five tested configurations, only the (1, 5) set-up, with one machine added to moulding and five to packaging, achieved better results than the current system. This scenario delivered the lowest average degradation cost at £120.29, a saving of roughly £5.48 (about 4.4%) per delayed product against the £125.77 baseline, confirming that expanding packaging capacity is the most effective way to reduce delays after moulding.

# Load color palette library
library("RColorBrewer")

# Heatmap of average degradation cost by machine combination
ggplot(experiment_results, aes(x = moulding_machines, y = packaging_machines, fill = cost)) +
  geom_tile() +
  geom_text(aes(label = round(cost, 2)), color = "white", size = 5) +
  scale_fill_gradientn(colors = c("#66C2A5", "#FED976", "#F4A582")) +
  labs(
    title = "Average Degradation Cost (£) per Machine Combination",
    x = "Number of Moulding Machines",
    y = "Number of Packaging Machines",
    fill = "Average Cost (£)"
  ) +
  theme_tufte(base_family = "Helvetica") +
  theme(
    plot.title = element_text(hjust = 0.5),
    legend.position = "none",
    axis.text.x = element_text(angle = 45, hjust = 1)
  )

Figure 14: Average degradation costs across machine combinations.

In contrast, other configurations failed to improve performance and even increased degradation costs. This reinforces the insight that inspection delays are more sensitive to packaging constraints than to moulding capacity.
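Programmatically, the best configuration can be pulled straight from the experiment results with dplyr. The sketch below uses a stand-in tibble rather than the real experiment_results: only the £120.29 figure comes from the report, and the remaining costs are illustrative placeholders.

```r
library(dplyr)

# Stand-in for experiment_results: only the 120.29 cost is from the
# report; the other values are illustrative placeholders.
demo_results <- tibble::tibble(
  moulding_machines  = c(11, 12, 13, 14, 15),
  packaging_machines = c(14, 13, 12, 11, 10),
  cost               = c(120.29, 131.40, 128.90, 133.20, 130.10)
)

# Lowest-cost configuration, i.e. (11 moulding, 14 packaging) here
demo_results |> slice_min(cost, n = 1)
```

This keeps the selection step reproducible if more machine combinations are added to the experiment later.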

This scenario analysis demonstrates the value of simulation-based planning. By modelling different capacity expansion strategies, ABC Co. can allocate resources where they generate the greatest financial return and operational impact, while avoiding ineffective investments. The findings strongly support prioritising packaging capacity as the most cost-effective solution for reducing degradation-related losses.

Conclusions

The simulation study confirms that ABC Co.’s current production system operates efficiently under standard conditions, with stable resource utilisation and generally low average waiting times across all three stages. The current set-up can process 1,000 orders in slightly more than 600 minutes. However, variability in arrival and processing times can cause occasional waiting spikes, particularly at the packaging stage. While inspection delays are less frequent, they remain critical due to the degradation cost of £150 per minute.

Scenario analysis showed that only one configuration—adding 1 moulding machine and 5 packaging machines—outperformed the current set-up by reducing degradation costs. All other tested allocations either had no impact or worsened performance.

These findings underscore the importance of strategic capacity planning, particularly when investment decisions aim to reduce inspection delays and degradation costs.

Recommendations

Based on the simulation findings, the following recommendations are proposed to improve operational performance and reduce costs:

  1. Prioritise monitoring of the inspection stage. Although average performance at the inspection stage is relatively strong, any delay here leads directly to product degradation and increased costs. Real-time monitoring can help detect and respond to queue build-ups promptly, preventing quality issues and unnecessary expenses.

  2. Reinforce packaging capacity. Packaging exhibited the highest average and peak waiting times across all stages, making it the most likely bottleneck under increased demand. Adding equipment or optimising operational processes at this stage should be considered to maintain system flow and avoid downstream delays.

  3. Adopt the (1 Moulding, 5 Packaging) expansion strategy. Scenario analysis identified this configuration as the only one that outperformed the current set-up, delivering the lowest average degradation cost. If six new machines are to be added, this distribution should be prioritised to maximise operational and financial benefit.

  4. Continue refining the simulation model. The current validated model provides a solid framework for future decision-making. As new data becomes available—whether from shifts in demand, changes in product mix, or updated production practices—it should be integrated to maintain the model’s accuracy and relevance.

  5. Incorporate detailed operational data for continuous improvement. More granular data such as machine downtimes, maintenance schedules, and labour constraints can be incorporated to enhance the model’s precision further. Regular updates will ensure the model remains a valuable tool for ongoing performance analysis and strategic planning.


Appendix 1: Session Info

# sessionInfo() reports the R version, the operating system, and the
# loaded packages with their versions. It is useful for checking package
# compatibility and diagnosing issues with installed software.
sessionInfo()
R version 4.5.2 (2025-10-31)
Platform: aarch64-apple-darwin20
Running under: macOS Tahoe 26.1

Matrix products: default
BLAS:   /System/Library/Frameworks/Accelerate.framework/Versions/A/Frameworks/vecLib.framework/Versions/A/libBLAS.dylib 
LAPACK: /Library/Frameworks/R.framework/Versions/4.5-arm64/Resources/lib/libRlapack.dylib;  LAPACK version 3.12.1

locale:
[1] en_US.UTF-8/en_US.UTF-8/en_US.UTF-8/C/en_US.UTF-8/en_US.UTF-8

time zone: Europe/Madrid
tzcode source: internal

attached base packages:
[1] stats     graphics  grDevices utils     datasets  methods   base     

other attached packages:
 [1] RColorBrewer_1.1-3 queueing_0.2.12    skimr_2.2.1        simmer.plot_0.1.19
 [5] simmer_4.4.7       knitr_1.50         ggthemes_5.1.0     plotly_4.10.4     
 [9] lubridate_1.9.4    forcats_1.0.0      stringr_1.5.1      purrr_1.0.4       
[13] readr_2.1.5        tidyr_1.3.1        tibble_3.2.1       ggplot2_3.5.2     
[17] tidyverse_2.0.0    dplyr_1.1.4       

loaded via a namespace (and not attached):
 [1] gtable_0.3.6      xfun_0.52         bslib_0.9.0       htmlwidgets_1.6.4
 [5] tzdb_0.5.0        crosstalk_1.2.1   vctrs_0.6.5       tools_4.5.2      
 [9] generics_0.1.4    parallel_4.5.2    cluster_2.1.8.1   pkgconfig_2.0.3  
[13] checkmate_2.3.2   data.table_1.17.4 lifecycle_1.0.4   compiler_4.5.2   
[17] farver_2.1.2      repr_1.1.7        codetools_0.2-20  htmltools_0.5.8.1
[21] sass_0.4.10       yaml_2.3.10       lazyeval_0.2.2    htmlTable_2.4.3  
[25] Formula_1.2-5     pillar_1.10.2     crayon_1.5.3      jquerylib_0.1.4  
[29] cachem_1.1.0      Hmisc_5.2-3       rpart_4.1.24      tidyselect_1.2.1 
[33] digest_0.6.37     stringi_1.8.7     bookdown_0.43     labeling_0.4.3   
[37] fastmap_1.2.0     grid_4.5.2        colorspace_2.1-1  cli_3.6.5        
[41] magrittr_2.0.3    base64enc_0.1-3   foreign_0.8-90    withr_3.0.2      
[45] backports_1.5.0   scales_1.4.0      bit64_4.6.0-1     timechange_0.3.0 
[49] rmarkdown_2.29    httr_1.4.7        bit_4.6.0         nnet_7.3-20      
[53] gridExtra_2.3     hms_1.1.3         evaluate_1.0.3    viridisLite_0.4.2
[57] rlang_1.1.6       Rcpp_1.0.14       glue_1.8.0        rstudioapi_0.17.1
[61] vroom_1.6.5       jsonlite_2.0.0    R6_2.6.1